Discussion of “Bayesian Nonparametric Latent Feature Models” by Zoubin Ghahramani
Authors
Abstract
Ghahramani and colleagues have proposed an interesting class of infinite latent feature (ILF) models. The basic premise of ILF models is that there are infinitely many latent predictors represented in the population, with any particular subject having a finite selection. This is presented as an important advance over models that allow a finite number of latent variables. ILF models are most useful when all but a few of the features are very rare, so that one obtains a sparse representation. Otherwise, one cannot realistically hope to learn about the latent feature structure from the available data. The utility of sparse latent factor models has been compellingly illustrated in large p, small n problems by West (2003) and Carvalho et al. (2006). Given that performance is best when the number of latent features represented in the sample is much less than the sample size, it is not clear whether there are practical advantages to the ILF formulation over finite latent variable models that allow uncertainty in the dimension. For example, Lopes and West (2004) and Dunson (2006) allow the number of latent factors to be unknown using Bayesian methods. That said, it is conceptually appealing to allow additional features to be represented in the data set as additional subjects are added, and it is also appealing to allow partial clustering of subjects. In particular, under an ILF model, subjects can have some features in common, leading to a degree of similarity based on the number of shared features and the
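To make concrete the point that additional features become represented as additional subjects are added, the following is a minimal simulation sketch (not part of the original discussion) of the Indian Buffet Process prior that underlies ILF models. The function name sample_ibp and the mass parameter alpha are illustrative assumptions; the number of active features it produces grows roughly like alpha times the logarithm of the number of subjects, while most features remain rare.

```python
import numpy as np

def sample_ibp(alpha, n_subjects, seed=None):
    """Simulate a binary feature matrix Z from the Indian Buffet Process prior.

    Z[i, k] = 1 means subject i possesses latent feature k.  Columns (features)
    are added as new subjects arrive, so the number of active features grows
    with the sample size.
    """
    rng = np.random.default_rng(seed)
    columns = []   # one list of 0/1 entries per active feature
    counts = []    # number of subjects that already have each feature

    for i in range(1, n_subjects + 1):
        # Existing features: subject i takes feature k with probability m_k / i.
        for k, m_k in enumerate(counts):
            take = rng.random() < m_k / i
            columns[k].append(1 if take else 0)
            counts[k] += int(take)

        # New features introduced by subject i: Poisson(alpha / i) of them.
        for _ in range(rng.poisson(alpha / i)):
            columns.append([0] * (i - 1) + [1])
            counts.append(1)

    if not columns:
        return np.zeros((n_subjects, 0), dtype=int)
    return np.array(columns, dtype=int).T


Z = sample_ibp(alpha=2.0, n_subjects=50, seed=0)
print(Z.shape)        # (50, K): K features active among 50 subjects
print(Z.sum(axis=0))  # feature popularity: a few common features, many rare ones
```

The column sums illustrate the sparsity pattern discussed above: a handful of widely shared features and a long tail of rare ones, which is the regime in which learning the latent feature structure is realistic.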
Related Articles
General Latent Feature Modeling for Data Exploration Tasks
This paper introduces a general Bayesian nonparametric latent feature model suitable for automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be discrete, continuous, or mixed variables. The proposed model has several important properties. First, it accounts for heterogeneous data while it can be inferred in linear time with re...
Correlated Non-Parametric Latent Feature Models
We are often interested in explaining data through a set of hidden factors or features. When the number of hidden features is unknown, the Indian Buffet Process (IBP) is a nonparametric latent feature model that does not bound the number of active features in a dataset. However, the IBP assumes that all latent features are uncorrelated, making it inadequate for many real-world problems. We introdu...
General Table Completion using a Bayesian Nonparametric Model
Even though heterogeneous databases can be found in a broad variety of applications, there is a lack of tools for estimating missing data in such databases. In this paper, we provide an efficient and robust table completion tool, based on a Bayesian nonparametric latent feature model. In particular, we propose a general observation model for the Indian buffet process (IBP) adapted to mixed ...
Nonparametric Bayesian Sparse Factor Models with application to Gene Expression modelling
A nonparametric Bayesian extension of Factor Analysis (FA) is proposed where observed data Y is modeled as a linear superposition, G, of a potentially infinite number of hidden factors, X. The Indian Buffet Process (IBP) is used as a prior on G to incorporate sparsity and to allow the number of latent features to be inferred. The model’s utility for modeling gene expression data is investigated...
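To make the model structure described in this abstract concrete, the following is a minimal finite-truncation sketch (an illustration, not the paper's code) of a sparse factor model in which a binary mask stands in for the IBP-distributed sparsity pattern on the mixing matrix G; all variable names, the truncation level, and the mask probability are assumptions.

```python
import numpy as np

# Sketch of a sparse factor model Y = G X + noise, where G = Z * A combines a
# binary sparsity mask Z (which an IBP prior would supply in the full model)
# with continuous loading magnitudes A.  Finite truncation is used here purely
# for illustration.
rng = np.random.default_rng(0)
n_obs, n_dims, n_factors = 100, 20, 5

Z = rng.random((n_dims, n_factors)) < 0.3        # sparse binary mask (IBP stand-in)
A = rng.normal(size=(n_dims, n_factors))         # loading magnitudes
X = rng.normal(size=(n_factors, n_obs))          # hidden factors
sigma = 0.1                                      # observation noise scale

Y = (Z * A) @ X + sigma * rng.normal(size=(n_dims, n_obs))
print(Y.shape)   # (20, 100): observed data with sparse latent factor structure
```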